-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Map type annotations to source text #345
Conversation
This is looking far more successful than I expected. I believe it's working well enough that this PR is now heading for completion:
The one part that I think won't happen soon is in-source call navigation, as that involves fancy REPL features we currently lack. For now I think we'll have to live with the current SSA-based call list. We might at least be able to strip the SSAs and print line numbers which could be matched against line numbers printed alongside the source. CC @pfitzseb in case there's any of this that should be done mindfully of VS Code. |
I think this is a fabolous idea and effort. There are a few other low hanging UX efforts that could go well with this (like using Term.jl or something similar to implement paging for source code). The bigger question here for me is. Should this be the default mode? I am also feel that this transform Cthulhu to something else entirely and it might be better called Cuddulu or FluffyCthulhu. I love the name Cthulhu but it was also in part choosen to convey the notion of "all hope is lost, but I will persist in my quest for knowledge", that surrender of sanity in pursuit of elder wisdom xD |
The other question is, can we do a medium thing? I don't find it all to useful to just show the |
We could certainly split it out. We'd need a common core that implements all the wonderful work that's been done to ensure that inference is handled correctly. Would Cthulhu-proper then be much more than a UI package?
At present, the source will show In the long run, though, I had imagined a "cursor" over the source code, and you can dive into calls simply by moving the cursor to the right call and hitting |
At the risk of ruining the fun of a lovely joke, I wonder how many beginning coders might feel that way even about the "fluffy" mode this implements. "Why does Julia have types at all, it's so annoying?!?" That is, to sophisticated developers like you, the two/three/four/five-language problem that you need to be able to read and understand Julia+lowered Julia+type-inferred Julia+IRCode+LLVM+native code seems complex, but for complete newbies even the notion of tracing types through a program may be a level of complexity that just about breaks them, even when we try to help them as much as we can. If there's more than a grain of truth in that, then perhaps the name Cthulhu is still appropriate, considering the audiences. "Cthulhu: think it's easy? Wait until you try to gain the next level of insight!" |
Since VS Code was brought up: julia-vscode/LanguageServer.jl#1077 is somewhat relevant. As always, with the current design of the language server, there's a bit of a tension between wanting to get type info statically from the source code, but having a full REPL context getting us much further. JET.jl does have integration for "runtime diagnostics"; we could conceivably do something similar for Cthulhu as well. |
Yeah I sometimes wonder if Cthulhu as a name is not just another barrier to entry. Hard to spell, hard to pronounce, comes from a fantasy canon with questionable pedigree... It captured very well my emotional state at the time of first draft, but it might not be the best name, |
I do see your point. Splitting the fluffy version out would be very easy and provide an opportunity for a rename, reserving the scarier mode for deep analysis. I think it is fitting & best to have you make the call, so just let me know when you've decided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks cool! So great to see JuliaSyntax
in use :)
Some of the parts you're trying to use here are very under-documented in JuliaSyntax, and the APIs are still under-developed (SyntaxNode
and SourceFile
at least). Glad (not surprised!) to see you figured it out :)
The highlighting you're doing here looks very similar to the way diagnostics are show in JuliaSyntax and it looks like you might have rewritten some of the same tools? Check out how diagnostics are printed at https://github.com/JuliaLang/JuliaSyntax.jl/blob/e57b80044e3a863924909ef51e1370c381109e58/src/diagnostics.jl#L44
We should share tools for pretty printing upstream in JuliaSyntax ❤️
src/sourcetext.jl
Outdated
srctxt = node.source.code | ||
firstidx, _lastidx = first_byte(node), last_byte(node) | ||
if lastidx + 1 < firstidx # do any "overdue" printing now. Mostly, this catches whitespace | ||
print(io, srctxt[lastidx+1:firstidx-1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This string indexing might be problematic for unicode? Especially the firstidx-1
which may not work for multi byte chars? JuliaSyntax.SourceFile
might help here in particular getindex(source::SourceFile, rng::AbstractRange)
since that's careful with string indices.
src/sourcetext.jl
Outdated
function show_src_expr!(io::IO, taken, src::CodeInfo, lineno::Int, node::SyntaxNode, lastidx::Int; | ||
iswarn::Bool=false, hide_type_stable::Bool=false) | ||
hd = head(node) | ||
srctxt = node.source.code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
node.source
is a SourceFile
which might help here?
src/sourcetext.jl
Outdated
return lastidx | ||
end | ||
|
||
function stringify(calltok::SyntaxNode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to stringify the node exactly as it was in the original source, you can just use JuliaSyntax.sourcetext(calltok)
?
Co-authored-by: c42f <[email protected]>
Thanks @c42f! Your timing is excellent; just this morning I was thinking that I needed to rework all the printing here. The idea of upstreaming anything possible to JuliaSyntax sounds like the right design. In some of my local work (not yet pushed because it's a little too ugly still), I've started on the first point in my TODO list. Here's a screenshot of the current output: As you can see, I've started adding some warnings that some calls couldn't be found in the type-inferred code. The printing of the line number is unlikely to be permanent, it's mostly for me to check whether I've gotten the line number right. (In case you're curious, for the cases here, some of them can't be found because type-inference determines that they will never run (given the input types the condition is false), others because the code is actually split out into an anonymous function (which is a separate So one issue is that I'll need not only to print type-annotations after some |
Cool. Interspersing diagnostics is tricky. I so want to preserve the "shape of the original code" in pretty printing. But in plain text that conflicts with providing this extra diagnostic info. We discussed this back and forth a little in JuliaLang/JuliaSyntax.jl#150 I can't really see a way around some kind of annotations system which will break the shape of the original code. I had some ideas for fancier browser-based UI in JuliaLang/JuliaSyntax.jl#150 (still somewhat integrated with the terminal experience via ANSI escape code hyperlinks) but we'll still need something which works well in a plain terminal by default. |
Thanks for the many improvements, folks! |
OK, I think I've belatedly come up with a much better design. The idea is to orthogonalize the analysis (the matching between text & types) and the printing:
Note, the "unmached calls" could in fact be handled by defining a special type I welcome additional suggestions, so please share any that occur. But to me this seems obviously better than the approach I've taken here, so I'll probably get started on this tomorrow morning. |
I think I am ready to move forward with a more "final" implementation. I think the key question is how to organize the codebases. @vchuravy, have you decided whether you'd prefer the UI part to be separate from Cthulhu? If it should be separate, then I propose to create a package ("TypeBrowser"? "CodeTyped"? "CodeSurfer"?) in the JuliaDebug org that would have the Cthulhu-like "descending" UI but oriented towards source text; this repo could also have a subdir package ("TypedSyntax"?) that handles the matching between the trees produced by JuliaSyntax and type-inferred code. The output of "TypedSyntax" would be a If instead you want it to live under the Cthulhu umbrella, then I'd propose the "TypedSyntax" module could either be a standalone package (JuliaDebug? or other org?) or could live in this repo as a subdir package. The UI elements would be added to Cthulhu itself. |
This includes fairly trivial tests of the new Source mode. More should be added later.
Locally all tests pass; the only failure (on 1.9) is the fact that TypedSyntax can't be registered until after this merges. The tests of the new UI mode are minimal but probably good enough to merge. But I will hold off in case someone wants to review this. |
Also avoid showing "dump Params cache" option in `annotate_source` mode.
Some issues I found during playing with the latest commit:
|
s += x | ||
end | ||
return s | ||
end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we need to remove this semicolon otherwise we can't construct TypedSyntaxNode
:
julia> function summer(list)
s = 0
for x in list
s += x
end
return s
end;
julia> TypedSyntaxNode(summer, (Vector{Float64},))
ERROR: MethodError: no method matching iterate(::Nothing)
Closest candidates are:
iterate(::Union{LinRange, StepRangeLen})
@ Base range.jl:887
iterate(::Union{LinRange, StepRangeLen}, ::Integer)
@ Base range.jl:887
iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}}
@ Base dict.jl:716
...
Stacktrace:
[1] indexed_iterate(I::Nothing, i::Int64)
@ Base ./tuple.jl:93
[2] JuliaSyntax.TreeNode{TypedSyntax.TypedSyntaxData}(f::Any, t::Any; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ TypedSyntax ~/julia/packages/Cthulhu/TypedSyntax/src/node.jl:20
[3] JuliaSyntax.TreeNode{TypedSyntax.TypedSyntaxData}(f::Any, t::Any)
@ TypedSyntax ~/julia/packages/Cthulhu/TypedSyntax/src/node.jl:18
[4] top-level scope
@ REPL[33]:1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update your CodeTracking
Co-authored-by: Shuhei Kadowaki <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
I fixed the issues you noticed. It now falls back to the typed view if the source can't be found. (That's a very important change, thanks!)
s += x | ||
end | ||
return s | ||
end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update your CodeTracking
Co-authored-by: Shuhei Kadowaki <[email protected]>
Co-authored-by: Shuhei Kadowaki <[email protected]>
One thought I had: should I split this into two PRs?
|
I agree with it :) |
See #353. Future reviews & development of TypedSyntax should go there. |
The list of call sites serves two purposes: - it gives you options to descend into - it summarizes the calls made by your function Given the second purpose, I've sometimes been confused by the absence of an entry for the "worst" of all cases, runtime dispatch. This seems particularly accute for #345, but may have value even when examining the CodeInfo/IRCode.
The list of call sites serves two purposes: - it gives you options to descend into - it summarizes the calls made by your function Given the second purpose, I've sometimes been confused by the absence of an entry for the "worst" of all cases, runtime dispatch. This seems particularly accute for #345, but may have value even when examining the CodeInfo/IRCode.
The list of call sites serves two purposes: - it gives you options to descend into - it summarizes the calls made by your function Given the second purpose, I've sometimes been confused by the absence of an entry for the "worst" of all cases, runtime dispatch. This seems particularly acute for #345, but may have value even when examining the CodeInfo/IRCode.
This is an early-stage effort to make Cthulhu "newbie-friendly" by printing type-annotations against the raw source code. Timing-wise, there are several factors that led me to tackle & submit this now:
First a small (and very imperfect) demo is in order (EDIT: the demo below is outdated, the most up-to-date demo is in this branch's README):
(After this there is "cruft" from the fact that this source view has not been fully integrated so that it supersedes the current infrastructure.)
To hint at the option to suppress type-stable types,
Here it's only being applied to the signature, since very little else is type-stable in this example.
One really big caveat: I'm not sure this can be done properly until JuliaSyntax also handles lowering. The ugliest part of this by far is matching parsed expressions to their type-inferred equivalents. I'd particularly appreciate hearing thoughts on this issue.
In the long term future, it might be nice to allow the cursor to traverse the source text and enter into particular calls from there, without needing to print the list of calls.
CC @c42f